(57) Abstract: The invention includes a handheld device for receiving audio input in the form of speech. The handheld device is
specifically optimized for digitally recording speech input for the purpose of speech recognition. The handheld device includes an
ergonomically positioned pointing device to enable dictation and navigation through a document using only one hand. The handheld
device may optionally include a memory device, a fingerprint security device, and a barcode scanner.

USB DICTATION DEVICE

BACKGROUND OF THE INVENTION

Dictation devices have been in common use in many fields in which it is inconvenient or 
undesirable to make handwritten or typewritten notes. One of the fields in which dictation 
devices have long been prevalent is the medical profession, and particularly so among 
radiologists, who often dictate their findings and diagnoses while referring to a photographic
print of radiological data such as X-rays. It is common in these fields for a user of a dictation
device to make voice recordings and provide these recordings to transcriptionists, who transcribe 
the recordings in order to generate written transcripts of the recording for the dictator's review or 
for record keeping purposes. 

More recently, dictation technology has developed significantly and includes such tools
as speech recognition software to eliminate some of the need for transcriptionists to transcribe
recordings. However, a recognized and pervasive problem in the art is that speech recognition 
requires high quality audio input. Low quality audio input decreases the effectiveness of speech 
recognition algorithms, and frequently prevents them from functioning at all. It is therefore 
desirable to provide a microphone with superior audio quality for connecting to a computer for
implementing speech recognition.

In some applications of dictation devices and speech recognition tools, a dictator uses a 
dictation device and speech recognition to complete blank text fields in a form. For example, 
many medical practices have specific forms wherein there is a printed query or prompt, followed 
by a blank text field into which a practitioner provides the requested text information regarding a 
particular patient. One way in which different fields are selected and activated to receive text
from a voice recognition interface is by use of a pointing device, such as a mouse. However, this
manner of selecting various fields for text input is unduly awkward, as a user would typically
prefer to use the same hand to manipulate the microphone as the mouse. Thus, it is desirable to
provide an ergonomically convenient way to navigate through forms containing fields in which
text is entered through a dictation device and a speech recognition interface.

Technological advances have led to newer and faster types of interfaces between 
peripheral devices and computers. Dictation microphones for connecting to computers running 
speech recognition software have been known in the art. However, these microphones connect to
computers through one or more serial ports, and often require other connections as well,
including speaker in, audio in, speaker out, audio out, power, RS232, and game port connections.
These multiple connections make connecting the microphone to a computer a time consuming 
and complex process. Furthermore, many dictation microphones require connection to a sound 
card, which many laptop and some desktop computers lack. Therefore, a substantial number of
computers are unable to connect to these sorts of dictation microphones. However, the
development of the USB and USB2 standards ("Universal Serial Bus" and "USB2" are both
hereafter simply referred to as "USB") has brought significant increases in speed and aided in
uniform compatibility between peripheral devices and computers. It is desirable to provide a
dictation microphone that can be connected simply and easily to a computer through a USB
connection, through which two-way communication between the microphone and the computer is
established, and from which the microphone can draw substantially all of its power requirements.

In some applications a barcode identifier is used to identify the subject about which the 
user of a dictation device is dictating. For example, in the field of radiology, it is common for a 
radiologist to use a scanner to scan a barcode on a radiograph, such as an X-ray, which identifies
the patient and/or the X-ray. The radiologist then records his findings and diagnoses into the
dictation device with the assurance that this recording will be associated with the correct patient
and/or X-ray. It is therefore desirable to provide a dictation device that may include a barcode or
other such scanning ability.

A difficulty arises, however, because scanning devices generally, and in particular laser 
scanning devices, generate electromagnetic fields that can interfere with microphone circuitry 
and degrade audio signals, thus making speech recognition of those signals less accurate or 
impossible. Therefore, there is a need for circuitry that can overcome the effects of interference
between the scanning and microphone elements in a dictation device having a scanner. It is
desirable to have an integrated scanner/dictation device that can interface with a computer 
through a USB connection, and which can draw all of its power requirements from the USB port. 
However, the USB standard limits the amount of current that can be drawn to 500mA, which 
may be insufficient to drive both the scanner and the dictation device.

OBJECTS OF THE INVENTION

In light of the above-identified deficiencies of the prior art, an object of the present
invention is to provide a microphone with superior audio quality for connecting to a
computer for implementing speech recognition.

It is another object of the present invention to provide an ergonomically convenient way to
navigate through forms containing fields in which text is entered through a dictation device and a
speech recognition interface.

Yet another object of the present invention is to provide a dictation microphone that can 
be connected simply and easily to a computer through a USB connection, through which two way 
communication between the microphone and the computer is established, and from which the
microphone can draw substantially all of its power requirements. 

Still another object of the present invention is to provide a dictation device that
may include a barcode or other such scanning ability.

Another object of the present invention is to provide an integrated scanner/dictation device
that can interface with a computer through a USB connection and which can draw all of its power
requirements from the USB port.

SUMMARY OF THE INVENTION

The invention is a handheld device for receiving audio input in the form of speech, the
audio input recorded in a recording medium and processed by a speech recognition engine, 
thereby generating text. The handheld device may comprise an omni-directional microphone
element disposed at a distal end of the device for receiving the audio input in the form of speech
and for generating an analog signal therefrom. The handheld device may further comprise a wind
screen selected according to a predetermined wind noise sensitivity factor. The wind screen may
be acoustically transparent and attenuate wind (air flow) noise, for example from hard consonants,
that creates too great an analog signal and would otherwise disrupt the speech recognition. The
handheld device may further comprise an electric circuit for receiving the audio input in the form
of speech and for converting the analog signals to digital signals. The electric circuit may have a
gain control for providing a signal level that is suitable for speech recognition.

The handheld device is preferably used in conjunction with a software program that 
utilizes a speech recognition engine to fill in blank portions of forms, thus allowing a user to
input text in a blank portion of a form by simply speaking the words to be entered in that portion.
The handheld device may include a set of input buttons ergonomically positioned on the top
surface of the handheld device. The input buttons may include one or more buttons for 
selectively navigating through predetermined sections of a form, a button for selecting a 
predetermined section of a form, and a button for initiating recording. 

The navigation buttons may include a button for advancing to the next predetermined
section of a form and a button for going back to a previous predetermined section of a form. The
buttons may further include a select button for selecting the active element in a form. The
navigation buttons may be used to navigate through various elements in a form by sequentially
activating those elements; then, when a particular desired element is activated, the select button
may be pressed to select the active element. Once an active element has been selected, text may
be inserted into that element. The input buttons may include a button for initiating dictation.
Preferably, text generated by the speech recognition engine is not immediately displayed as it is
recognized because it has been found that for many users, this would be too distracting. Thus,
when dictation has been initiated by the user by pressing the dictation button, the text generated
by the speech recognition engine is held in a buffer until the user presses an "insert text" button.
When the user presses the "insert text" button, the contents of the buffer are displayed in the
selected element of the form. In an alternative embodiment, the text generated by the speech
recognition engine may be displayed immediately after the speech has been recognized and while
the user is dictating.
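
By way of non-limiting illustration, the navigation and buffered insertion behavior described above may be sketched in software as follows. The class name FormSession, its method names, and the choice to stop at the first and last fields rather than wrap around are assumptions made for this sketch only and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FormSession:
    """Hypothetical model of a form driven by the handheld buttons."""
    fields: Dict[str, str]                       # field name -> text shown in the form
    active: int = 0                              # index of the currently active field
    selected: Optional[str] = None               # field chosen with the "select" button
    buffer: List[str] = field(default_factory=list)  # recognized text not yet shown

    def next(self) -> None:
        # Assumption: navigation stops at the last field instead of wrapping.
        self.active = min(self.active + 1, len(self.fields) - 1)

    def previous(self) -> None:
        self.active = max(self.active - 1, 0)

    def select(self) -> None:
        self.selected = list(self.fields)[self.active]

    def recognized(self, text: str) -> None:
        # Text from the recognizer is held back, not displayed.
        self.buffer.append(text)

    def insert_text(self) -> None:
        # The "insert text" button moves the buffer into the selected field.
        if self.selected is None:
            return
        self.fields[self.selected] += " ".join(self.buffer)
        self.buffer.clear()
```

In this sketch, a field remains unchanged while the user dictates and is updated only when insert_text is called, which corresponds to the preferred embodiment; the alternative embodiment would write to the field directly from recognized.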

Other functions of the buttons may include selecting a portion of text, playing a portion of 
an audio file, a stop button, a button for reviewing back through an audio file to a previously 
recorded portion of the file, a button for advancing through an audio file, and a button for playing
a portion of an audio file through an integrated speaker on the handheld device. In one
embodiment, the button for playing a portion of an audio file is also a stop button, such that if
any function of the microphone is active, pressing the play/stop button will stop any active
function. If no function of the microphone is active, pressing the play/stop button will play a
portion of an audio file. The handheld device may also include a button for dictation which,
when pressed, activates the microphone element to receive audio input in the form of speech.
When this element is active, the audio input is digitized and transmitted to the speech recognition 
engine, which then translates the speech into text. 
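
The combined play/stop behavior described above amounts to a two-way decision on the current state of the device. A minimal sketch follows; the Activity enumeration and its member names are hypothetical and are introduced only for illustration.

```python
from enum import Enum


class Activity(Enum):
    """Hypothetical set of device states; names are illustrative only."""
    IDLE = "idle"
    DICTATING = "dictating"
    PLAYING = "playing"
    REVIEWING = "reviewing"
    FAST_FORWARDING = "fast_forwarding"


def press_play_stop(current: Activity) -> Activity:
    """Stop whatever is active; if nothing is active, start playback."""
    if current is Activity.IDLE:
        return Activity.PLAYING
    return Activity.IDLE
```

Pressing the button twice from an idle state therefore starts and then stops playback, and pressing it during dictation ends the dictation.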

The handheld device may also include one or more buttons with programmable functions.
These buttons may be assigned a function that may generally be quite complex by way of user 
recorded or prerecorded macros. In a preferred embodiment, the handheld device includes a 
"signature" button for electronically signing a form after it has been filled out. Actuation of this 
button designates the current document as signed, saves the current document, and blocks further
editing. For example, if the handheld device is being used by a medical doctor, the doctor
may complete a form and review it for accuracy, then press the "signature" button, which adds a 
marker to the document indicating that it has been completed and reviewed, and that the doctor 
acknowledges that the information contained in the document is true and complete. After the
"signature" button is pressed, the document may be saved and marked as read-only to prevent
further editing of the document. 
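
The effect of the "signature" button may be illustrated by the following sketch, which models the document in memory. The FormDocument class, the format of the signature marker, and the decision to ignore a second press are assumptions of this sketch; the specification does not prescribe a particular data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FormDocument:
    """Hypothetical in-memory stand-in for a completed form."""
    text: str
    signed_by: str = ""
    read_only: bool = False
    saved_versions: List[str] = field(default_factory=list)

    def edit(self, new_text: str) -> None:
        if self.read_only:
            raise PermissionError("document is signed and read-only")
        self.text = new_text

    def save(self) -> None:
        self.saved_versions.append(self.text)


def press_signature(doc: FormDocument, signer: str) -> None:
    """Add a signed marker, save the document, then block further editing."""
    if doc.read_only:
        return  # assumption: a second press on a signed document does nothing
    doc.signed_by = signer
    doc.text += "\n[signed: " + signer + "]"
    doc.save()
    doc.read_only = True
```

The order of operations matters in this sketch: the marker is added and the document saved before the read-only flag is set, so that the saved copy already carries the signature.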

Another button that would be particularly useful for medical professionals is a "coding" 
button. When the "coding" button is pressed, a dialog box may appear containing diagnosis 
numbers. Particular maladies are typically assigned unique codes. A doctor can press the
"coding" button and view a menu or other dialog box and select a numerical code corresponding
to the diagnosis. This code may then be appended to the document. The numerical codes
simplify billing procedures because most insurance carriers rely on these codes rather than a 
textual description of the maladies treated in order to determine how much to pay for particular 
procedures. The codes may also provide a database to ease the process of searching, for
example, for patients with particular maladies. Further details on coding operations can be found
in co-pending U.S. Provisional Patent Application Serial No. 60/436,456, filed December 27,
2002, incorporated herein by reference in its entirety.

Preferably, the buttons are ergonomically positioned to allow one-handed navigation
through the predetermined sections of a form, selection of a predetermined section of a form,
initiation of recording, and insertion of text. The ergonomic positioning of the buttons preferably
allows one-handed operation of every function available on the microphone.

The handheld device may include a pointing device on its top surface that may be
actuated by the user's thumb. Preferably, the pointing device is a microjoystick or thumbstick,
although it may also be a trackball or touchpad, or any other suitable pointing device. The 
pointing device has associated with it at least one button. Preferably, there is a button on the top 
of the handheld device, just below the pointing device. Most preferably, there is a button on the
bottom of the handheld device, beneath where a user's index finger would naturally rest, such
that advanced mouse maneuvers such as drag-and-drop may be performed by holding down the 
button with the index finger while moving the pointing device with the thumb. 

In one aspect, the invention is a handheld USB device for receiving audio input in the 
form of speech. The audio input may be recorded in a recording medium and processed by a
speech recognition engine, thereby generating text. The handheld device may include a USB hub
for receiving and transmitting signals through a USB interface to a USB root hub in a computer. 
It may further include a power switch for switching power drawn from the USB interface 
between a first USB port on the USB hub and a second USB port on the USB hub. It may further 
include a pointing device connected to the second USB port on the USB hub. It may further
include a USB streaming controller connected to the first USB port on the USB hub. The USB
streaming controller may receive digital audio signals from an audio codec, which converts 
analog audio signals from a microphone element on the handheld device into the digital audio 
signals.

Preferably, the handheld USB device includes a preamplifier between the audio codec
and the microphone element for amplifying the analog audio signals from the microphone
element before they are digitized by the audio codec into digital audio signals. The handheld
USB device also preferably includes a speaker. Digital audio signals from the USB streaming
controller may be converted into analog signals by the audio codec. These analog signals may
then be amplified by an amplifier, and the amplified analog audio signals then used to drive the 
speaker. 

The handheld USB device may include buttons connected to switches, such as USB 
human interface devices. The buttons and switches may be arranged such that when a button is
pressed, its corresponding switch is closed. Alternatively, when a button is pressed, its
corresponding switch may be opened, if the default position of the switch is closed. The USB 
streaming controller may be used to detect whether a switch is opened or closed, and thereby may 
send a signal through a USB port on the USB hub, which then transmits the signal through a 
USB interface to a USB root hub on a host computer. The USB streaming controller may also be
connected to one or more LEDs, which the USB streaming controller may light in response to a
signal indicating that a button has been pressed. Preferably, the USB streaming controller lights 
an LED when the button corresponding to the "record" function of the handheld USB device is 
pressed. 
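
The detection of button presses from switch states, for both normally open and normally closed switches, may be sketched as follows. The bit assignment in BUTTONS, the function names, and the representation of the switch states as an integer bitmask are assumptions of this sketch and do not describe the firmware of any particular controller.

```python
from typing import List

# Hypothetical bit assignment: bit 0 is "next", bit 1 is "previous", and so on.
BUTTONS = ("next", "previous", "select", "insert_text",
           "dictate", "fast_forward", "review", "stop_play")


def newly_pressed(prev_state: int, new_state: int, normally_closed: int = 0) -> List[str]:
    """Return the buttons pressed between two samples of the switch lines.

    A set bit means the switch is electrically closed. Bits set in
    normally_closed mark switches whose default position is closed, so for
    those switches an open circuit means the button is pressed.
    """
    prev_pressed = prev_state ^ normally_closed
    new_pressed = new_state ^ normally_closed
    rising = new_pressed & ~prev_pressed  # pressed now, not pressed before
    return [name for i, name in enumerate(BUTTONS) if (rising >> i) & 1]


def record_led_on(pressed: List[str]) -> bool:
    """Light the record LED when the "dictate" button has been pressed."""
    return "dictate" in pressed
```

For example, with all switches normally open, a change from state 0b00000 to 0b10000 reports a press of "dictate" and lights the record LED.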

In one embodiment, the handheld USB device may include a data storage device.
Preferably, the data storage device is a memory stick or a SmartMedia™ card. In other
embodiments, the data storage device may be any data storage means known to those in the art.
In another embodiment, the handheld USB device may include a fingerprint security device.
Preferably, the fingerprint security device will lock out any unauthorized user, thus preventing
unauthorized users from using the handheld USB device to create or alter medical records. The
handheld USB device may further include a barcode scanner for scanning barcodes, for example, 
on medical records to identify the patient, or on pharmaceutical packaging to ensure patients 
receive the correct medicines. 

The above advantages and features are of representative embodiments only, and are 
presented only to assist in understanding the invention. It should be understood that they are not 
to be considered limitations on the invention as defined by the claims, or limitations on 
equivalents to the claims. Additional features and advantages of the invention will become 
apparent from the drawings, the following description, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

While the specification concludes with claims particularly pointing out and distinctly 
claiming the present invention, it is believed the same will be better understood from the 
following description taken in conjunction with the accompanying drawings, which illustrate, in 
a non-limiting fashion, the best mode presently contemplated for carrying out the present
invention, and in which like reference numerals designate like parts throughout the figures, 
wherein: 

FIG. 1 is a top view of one embodiment of the dictation device of the invention. 
FIG. 2 is a bottom view of one embodiment of the dictation device of the invention. 
FIG. 3 is a block diagram of one embodiment of the dictation device of the invention
without an integrated scanner.

FIG. 4 is a block diagram showing the data flow among the various components of one 
embodiment of the dictation device of the invention. 

FIG. 5 is a block diagram of one embodiment of the dictation device of the invention with
an integrated scanner.

FIG. 6 is a block diagram of one embodiment of the dictation device of the invention with 
fingerprint security and memory stick features.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention encompasses dictation microphones and dictation microphones
with integrated scanning units. The invention includes a novel ergonomic layout for facilitating 
use of the dictation microphones with application software implementing voice recognition
algorithms to reduce voice data to text and to place the text in appropriate blank fields in a form.
The novel ergonomic layout further includes a pointing device with buttons positioned to 
facilitate pointing device functionality, such as drag-and-drop. The invention solves a number of 
problems associated with prior art methods of interfacing dictation devices with computers and 
provides a convenient USB interface. The invention further solves power limitation problems
associated with the integration of dictation devices with scanning devices using a USB interface.

It is known in the art that hard consonant sounds, like -d, -t, -p, -b, -x, and -z, have higher
instantaneous energies (that is, they create higher wind speeds than other sounds) and can reduce a
microphone's fidelity, and thus cause speech recognition rates to drop. This effect is termed
"Wind Noise Sensitivity" (WNS). The inventors have developed a test for determining whether
a microphone and windscreen combination is adequately resistant to these wind noises. For an
electret microphone with an output sensitivity of -44 dB (1 kHz, 0 dB = 1 V/Pa), the WNS
should be less than -60 dB at a wind speed of 1 meter/sec in order to provide an audio fidelity
adequate for voice recognition. It is anticipated, however, that increasingly sophisticated voice
recognition algorithms may overcome the limitations of noise introduced by hard consonant
sounds, and that a greater WNS would be adequate. One aspect of this invention is a method for
determining whether a particular microphone and windscreen combination is adequate for use 
with a voice recognition algorithm. 
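
The acceptance criterion stated above may be expressed as a simple comparison. The following sketch assumes, purely for illustration, that the WNS figure is obtained as a voltage level in decibels relative to a 1 V reference; the specification does not state how the measurement is taken, and the function names are hypothetical.

```python
import math

WNS_LIMIT_DB = -60.0  # limit stated in the text for a wind speed of 1 meter/sec


def level_db(v_rms: float, v_ref: float = 1.0) -> float:
    """Convert an RMS voltage to decibels relative to v_ref (assumed reference)."""
    return 20.0 * math.log10(v_rms / v_ref)


def windscreen_adequate(wns_db: float, limit_db: float = WNS_LIMIT_DB) -> bool:
    """A microphone and windscreen combination passes if its WNS is below the limit."""
    return wns_db < limit_db
```

Under these assumptions, a wind-induced output of 0.5 mV RMS corresponds to roughly -66 dB and passes, while an output of 10 mV RMS corresponds to -40 dB and fails. A less strict limit_db could be supplied if improved recognition algorithms tolerate a greater WNS.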

FIG. 1 shows a top view of an embodiment of the microphone of the invention 101. The
microphone element 102 may be protected by a windscreen consisting of a layer of foam padding 
material, preferably polyurethane foam, with a thickness of 0.5 to 1.6 cm and a porosity of 
between 60 and 90 ppi (pores per inch). Preferably, the windscreen is a polyurethane foam with
a porosity between about 100 and 130 ppi and a thickness of between 0.6 cm and 1 cm, most
preferably 0.9 cm. Most preferably, the windscreen minimizes the wind noise sensitivity of the 
microphone yet has a flat and high transmission profile across the audible frequency range 
(approximately 20 Hz to 20 kHz). Preferably, the windscreen is affixed to the microphone by 
being placed such that its sides are pinched between the microphone casing and the microphone
element itself in such a manner as to make it flush with the surface of the microphone. This
design prevents shear forces from detaching the windscreen. Preferably, glue is not used to affix 
the windscreen to the microphone because the glue could dislodge and get into the microphone 
element, thus degrading the analog signal from the microphone and making speech recognition 
difficult or impossible. 

The microphone 102 is preferably a close-talking microphone. Preferably, the
microphone element is adapted to receive audio input from a speaker whose mouth is between
0.5 and several inches from the microphone element. Preferably, the microphone element can be
as much as 45 degrees off the mouth axis. The large variation in frequency response of unidirectional
and noise-canceling microphones under those conditions makes these microphones ill-suited for
this application because of the resulting decrease in the accuracy of speech recognition.
However, these microphones may be used in this application if users are cautious about how they
hold the microphone, or if the microphone is mounted in a fixed relation to the user's mouth.
The microphone element is preferably an omni-directional microphone with a frequency response
that is substantially flat over a range of angles and distances. 

A plurality of buttons 104-111 allow the user to control the dictation functions. A "next"
button 104 allows the user to advance to the next field in a form. A "previous" button 105
allows the user to go back to the previous field in a form. A "select" button 106 allows the user
to select the current field in a form. An "insert text" button 107 allows the user to insert text
that has been stored in a buffer during dictation at the current position of the cursor. A "dictate"
button 108 begins the recording process, allowing the user's speech to be recorded. An LED
indicator 103 is lit when the microphone is recording. A "fast forward" button 109 allows the
user to skip through previously recorded speech to search for a particular portion. A "review"
button 110 allows the user to skip backward through previously recorded speech to search for a
particular portion. A combination "stop/play" toggle button either allows the user to play
previously recorded speech, or to stop playback of previously recorded speech or any other active 
function of the microphone. 

A thumbstick pointing device 112 is included to allow the user to navigate through a form
document or for any other purpose for which a mouse pointing device is normally used. The
thumbstick pointing device also includes a mouse button 113. In a preferred embodiment, the
pointing device is a force sensing resistor microjoystick pointing device. In alternative
embodiments, the thumbstick pointing device can be any other suitable pointing device, such as a
trackball.

Two additional buttons (not shown) are optionally present. Generally, these two buttons 
can be programmable to meet individual users' needs. In a preferred embodiment, actuation of 
one of the additional buttons executes a software routine to provide an electronic signature for 
signing forms after they have been filled out using the microphone and voice recognition
technology. In this embodiment, actuation of the signature button adds a "signed" notation to the 
form after it has been filled out, saves the completed form, and marks it read-only. In this 
embodiment, actuation of the other additional button executes a software routine to provide a 
menu of codes corresponding to a malady which a doctor has diagnosed in a patient. The doctor
selects the code corresponding to her diagnosis, and the code may be associated with the form or 
forwarded to the patient's insurer for billing purposes. Either of these two additional buttons 
may be programmed to have an arbitrary function. For example, one of the buttons may be 
programmed to automatically generate and send an email message containing a digital audio
file recently recorded on the handheld dictation device to a predetermined recipient,
as disclosed in co-pending U.S. Patent Application Serial No. 09/099,501, entitled "Dictation
System Employing Computer-to-Computer Transmission of Voice Files Controlled by Hand
Microphone," filed June 8, 1998, and incorporated herein by reference. 

FIG. 2 shows a bottom view of an embodiment of the microphone of the invention 201.
A speaker 202 may be included for playing back previously recorded speech or other wav files.
A button 203 may be used in conjunction with the thumbstick pointing device 112 and
preferably corresponds to a left mouse button. This allows the user to manipulate the thumbstick
pointing device 112 while simultaneously holding down the button 203, thus allowing for mouse
operations such as drag-and-drop. At the base of the microphone 204 may be a slot for
removable memory. Also included may be a space 205 for an integrated barcode scanner.

FIG. 3 shows a block diagram of an embodiment of the microphone of the invention. In 
this embodiment, a host PC 301 having a USB root hub 302 may be connected to the microphone 
303 through a USB connector 305. The USB connector connects to a USB hub controller 304 in 
the microphone 303. The USB hub controller 304 in the microphone 303 can interpret the
various signals coming from the elements on board the microphone 303 in order to send those 
signals in a meaningful way to the host PC 301. The USB hub controller 304 thus allows 
multiple elements in the microphone 303 to send signals to the host PC without the signals
becoming intractably entangled. In a preferred embodiment, the USB hub controller 304 is
comprised of a Texas Instruments™ TUSB2036 2- or 3-port USB hub chip. A power supply 
voltage regulator 307 may convert +5V routed from the USB connection to the +3.3V needed to 
power the USB hub controller chip 304. In a preferred embodiment, the USB hub controller 304 
is powered by a Texas Instruments™ TPS78833 power supply voltage regulator 307. 

A power switch 306 may provide power management for the downstream devices in
order to comply with USB power management requirements. Preferably, the power switch 306 is
a Texas Instruments™ TPS2044 chip. Both output ports of power switch 306 are tied together, 
and the +5V output is directed to two power supply voltage regulators 308 and 309. In the event 
of an overcurrent, the power switch 306 can switch off the power to the power supply voltage
regulators 308 and 309. When the power switch 306 is active, thus sending +5V to the power
supply voltage regulators 308 and 309, the power supply voltage regulators may convert +5V 
from the USB connection to the +3.3V needed to power the downstream elements. Regulators 
308 and 309 may independently provide +3.3V to the digital components of the handheld device 
(regulator 308) and the analog components of the handheld device (regulator 309). The
independence of the two regulators allows a highly uniform voltage source to be provided to the
analog elements (the microphone 314, speaker 315, and amplifiers 319 and 320), regardless of 
the power requirements of the digital components (the EEPROM 311, the audio CODEC 318, 
and the streaming controller 312).

Under the USB standard, only 500mA of current can be drawn from the USB connection.
Preferably, high-power functions will draw less than 100mA at power up. Thus, the power 
switch 306 serves the additional function of shutting down the downstream devices if they 
attempt to draw more than this maximum amount of current. Note that the USB
hub controller chip is preferably powered by an independent power supply voltage regulator 307
that is not switched by the power switch 306. In a preferred embodiment, the USB streaming 
controller 312 is attached to a current sensing device or other sensor 321. When the sensor 321 
senses an overcurrent or the activation of an element with excessive current requirements, the 
USB streaming controller 312 can send a signal 322 to the speaker amplifier 320 that mutes the
speaker 315, thus reducing the current requirements of the handheld device. In the embodiment
of the invention shown in Fig. 3, there is no barcode scanner, and thus it is not expected that a 
current in excess of 500mA would ever be required. Thus the sensor element 321 is not strictly 
necessary in this embodiment. However, the mute signal 322 may still be sent by the host 
computer 301 when the recording mode is activated in order to prevent sounds from the speaker
315 from being picked up by the microphone 314 during dictation.
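
The two conditions under which the speaker is muted, namely an excessive current demand and an active recording mode, may be sketched as follows. The function name, the dictionary of loads, and all current values other than the 500mA limit are hypothetical figures chosen for illustration and are not measurements of any embodiment.

```python
from typing import Dict, Tuple

USB_LIMIT_MA = 500.0  # maximum current stated in the text


def manage_speaker(loads_ma: Dict[str, float], recording: bool) -> Tuple[bool, float]:
    """Return (speaker_muted, total_current_ma_after_any_muting).

    The speaker is muted when recording is active, to keep speaker sound out
    of the microphone, or when the summed demand exceeds the USB limit.
    """
    total = sum(loads_ma.values())
    muted = recording or total > USB_LIMIT_MA
    if muted:
        # Assumption: a muted speaker draws no current.
        total -= loads_ma.get("speaker", 0.0)
    return muted, total
```

With hypothetical loads of 60mA for the codec, 150mA for the speaker, and 350mA for a scanner, the demand of 560mA exceeds the limit, so the speaker is muted and the draw falls to 410mA. Without a scanner the same device stays well under the limit and the speaker is muted only during recording.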

In a preferred embodiment, the dictation microphone 303 has an on-board pointing 
device, preferably a microjoystick pointing device 310. The pointing device 310 is connected to 
one of the ports of the USB hub controller 304. Microjoysticks that may be used with the present 
invention can be obtained from Interlink Electronics™. 

The microphone 303 further includes an electrically erasable programmable read only
memory (EEPROM) 311 for storing instructions including CODECs and input and output data
bit rates for the USB streaming controller 312. The EEPROM 311 is connected to the I2C port
of the USB streaming controller 312. The EEPROM can be programmed or reprogrammed by
way of signals from the USB root hub 302 through the USB connection 305 to the USB hub 304
and the USB streaming controller 312 using, for example, a device firmware upgrade utility
which is compliant with USB Device Class Specification for DFU 1.0. Preferably, the EEPROM 
is a Microchip Technology Inc.™ 24LC64 chip. 
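
As an illustration of what reprogramming such an EEPROM may involve, the following sketch splits a block of data into writes that do not cross page boundaries. The 32-byte page size and 8192-byte capacity are assumptions taken from the commonly published characteristics of 64 Kbit serial EEPROMs of this type, not from the specification, and the function name is hypothetical. The sketch does not perform any I2C or DFU transfer.

```python
# Illustrative only: page size and capacity are assumed, not stated in the specification.
PAGE_SIZE = 32      # assumed page-write buffer size in bytes
CAPACITY = 8192     # assumed capacity in bytes (64 Kbit)


def page_writes(start_addr, data):
    """Split data into (address, chunk) pairs, none of which crosses a page boundary."""
    if start_addr < 0 or start_addr + len(data) > CAPACITY:
        raise ValueError("write does not fit in the EEPROM")
    writes = []
    addr = start_addr
    offset = 0
    while offset < len(data):
        # Bytes remaining before the next page boundary.
        room = PAGE_SIZE - (addr % PAGE_SIZE)
        chunk = data[offset:offset + room]
        writes.append((addr, chunk))
        addr += len(chunk)
        offset += len(chunk)
    return writes
```

For example, ten bytes written starting at address 30 would be issued as a 2-byte write at address 30 followed by an 8-byte write at address 32.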

The USB streaming controller 312 receives the input from several switches 313 and the
microphone element 314, and provides the output to the on-board speaker 315 and two
light-emitting diodes (LEDs) 316 and 317. Each of the LEDs may be lit under certain specific
circumstances. For example, when the handheld device is plugged in, one LED may be lit to
indicate the device is ready to be used, and the other LED may be lit during the recording or
dictating process. The switches 313 correspond to the buttons 104-111 described with reference
to FIG. 1, as well as the two programmable buttons, such that actuation of any of those buttons
results in the closure of the corresponding switch. In a preferred embodiment, the USB
streaming controller is a Texas Instruments™ TAS1020A chip.
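
The correspondence between closed switches and actuated buttons can be sketched as a simple bit decode. The assignment of one input bit per button, in ascending order of reference numeral, is a hypothetical mapping chosen for illustration; the specification does not state how the switches 313 are wired to the controller inputs, and the two programmable buttons are omitted here.

```python
# Illustrative only: the bit-to-button assignment is assumed, not specified.
BUTTON_NUMERALS = list(range(104, 112))  # buttons 104-111 of FIG. 1


def closed_switches(port_bits):
    """Return the reference numerals of buttons whose switch bit is set."""
    return [
        numeral
        for bit, numeral in enumerate(BUTTON_NUMERALS)
        if port_bits & (1 << bit)
    ]
```

With this assumed mapping, an input value of 0b00000101 would indicate that the switches for buttons 104 and 106 are closed.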

Also connected to the USB streaming controller 312 is an audio coder/decoder (CODEC) 318
for converting analog signals from the microphone 314 into digital signals that can be transmitted
to the PC host 301 or stored in an optional on-board memory (not shown). Between the
microphone 314 and the audio CODEC 318 is a fixed gain front end amplifier 319 for amplifying
the analog signal from the microphone to take advantage of a greater dynamic range before the
signal is digitized. The audio CODEC 318 also serves the function of converting digital signals
received from the PC host 301 or stored in an optional on-board memory (not shown) into analog
signals, which can be played on the on-board speaker 315. Between the audio CODEC 318 and the
speaker 315 is an amplifier 320 for amplifying the analog signal to produce an adequate volume
at the speaker 315. In a preferred embodiment, the audio CODEC 318 is a Wolfson™ WM9707
chip, the fixed gain front end amplifier 319 is a National Semiconductor™ LMV110 chip, and
the speaker amplifier 320 is a 0.35W mono audio power amplifier comprising a Texas Instruments™
TPA301 chip. Also in this preferred embodiment, the microphone 314 is an electret condenser
microphone such as Panasonic's™ WM-52M and the speaker 315 is a Panasonic™ EAS2P104H
micro speaker.
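
The benefit of amplifying the microphone signal before digitization can be illustrated numerically. The sketch below counts how many converter output codes a signal of a given peak amplitude spans, with and without gain. The 16-bit resolution, the 1.0 V full-scale level, and the example gain of 20 are hypothetical values chosen for illustration and are not taken from the specification or from any component data sheet.

```python
# Illustrative only: resolution, full-scale level and gain are assumed values.
def codes_spanned(peak_volts, gain, full_scale_volts=1.0, bits=16):
    """Count the converter codes covered by a signal swinging +/- peak_volts after gain."""
    max_code = 2 ** (bits - 1) - 1
    # The amplified signal cannot exceed the converter's full-scale input.
    amplified_peak = min(peak_volts * gain, full_scale_volts)
    return 2 * int(amplified_peak / full_scale_volts * max_code) + 1
```

Under these assumptions, a 10 mV peak signal spans many more codes after a gain of 20 than without gain, which is the sense in which the front end amplifier makes fuller use of the converter's dynamic range.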

FIG. 4 is a block diagram showing the data flow through three conceptual "layers" - a
function layer 410, a USB device layer 460, and a USB bus interface layer 490. Solid arrows
represent data flow and hollow arrows represent logical communications flow, or instructions
provided from one component to another. The client software 400 can be any voice recognition
software, especially voice recognition software as part of a larger application that allows forms
with blank fields to be retrieved and where the blank fields can be filled in using voice
recognition. The USB system software 450 and USB host controller 480 can be any suitable
USB system and host controller software, for example those provided with typical operating
systems having USB capability, such as Windows™ or Linux. The USB hub 490 is internal to
the dictation device, and corresponds to element 304 in FIG. 3 for separate control of a USB
streaming controller and a pointing device controller. The USB device layer includes a USB
streaming controller 470 and the USB pointing device controller 465, each with a data
connection to the USB hub 490.

The function layer includes an audio function 430, for receiving analog audio data from
the microphone physical device and converting the analog data to digital for transmission
through the USB streaming controller 470 to the USB hub 490, through the USB host controller
480, through the USB system software 450, to the client software 400, where voice recognition
algorithms convert the audio data to text and place that text in the appropriate place, for
example in a form document. Within the audio function 430 is the capability to output to the
speaker physical device. The client software sends digitally encoded audio data through the USB
system software 450, the USB host controller 480, the USB hub 490, and the USB streaming
controller 470 to the audio function, which converts the digital data to analog. The analog signal
is then amplified, and the amplified signal drives the speaker physical device.

The function layer also includes several input/output (I/O) functions 420, for receiving
input in the form of an identity of an actuated button. This data is likewise transmitted through
the USB streaming controller 470, the USB hub 490, the USB host controller 480, the USB
system software 450, to the client software 400, which recognizes the actuated button as
corresponding to a command to be performed in the client software application.

The function layer also includes a pointing function 415, which receives data from the
pointing physical device and buttons. The pointing data is routed through the USB micro
joystick controller 465, through the USB hub 490, through the USB host controller 480, through
the USB system software 450, to the client software 400, where the pointing data is used, for
example, to navigate through a form having blank fields to be filled in by a user.
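
The three routes described above with reference to FIG. 4 can be summarized in a small lookup sketch. The element names and reference numerals are those used in the description; the data structure and the function name are hypothetical and serve only to make the routing explicit.

```python
# Illustrative only: the table and function are a restatement of the FIG. 4 description.
UPSTREAM_TAIL = [
    "USB hub 490",
    "USB host controller 480",
    "USB system software 450",
    "client software 400",
]

# Each function reaches the hub through its own device-layer controller.
PATHS_TO_HOST = {
    "audio function 430": ["USB streaming controller 470"] + UPSTREAM_TAIL,
    "I/O functions 420": ["USB streaming controller 470"] + UPSTREAM_TAIL,
    "pointing function 415": ["USB pointing device controller 465"] + UPSTREAM_TAIL,
}


def route(function_name, to_host=True):
    """Return the ordered list of elements that data passes through."""
    path = [function_name] + PATHS_TO_HOST[function_name]
    # Playback data travels the same elements in the opposite direction.
    return path if to_host else list(reversed(path))
```

Calling route("audio function 430", to_host=False) gives the playback route, beginning at the client software 400 and ending at the audio function 430.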

FIG. 5 is a block diagram of an embodiment of the microphone of the invention having an
integrated scanner. The basic design is similar to that shown in FIG. 3, except that there is
additionally a scanning device attached to one of the ports 505 of the USB hub. A host PC 301
having a USB root hub 302 may be connected to the microphone 303 through a USB connector
305. The USB connector connects to a USB hub controller 304 in the microphone 503. The
USB hub controller 304 in the microphone 503 can interpret the various signals coming from the
elements on board the microphone 303 in order to send those signals in a meaningful way to the
host PC 301. The USB hub controller 304 thus allows multiple elements in the microphone 303
to send signals to the host PC without the signals becoming intractably entangled. In a preferred
embodiment, the USB hub controller 304 is comprised of a Texas Instruments™ TUSB2036 2-
or 3-port USB hub chip.

The scan engine 520 may be an integrated device such as the SE923-I00A, which has an
RS232 output. The output from the scan engine should be converted from the RS232 standard to
the USB standard, for example, by use of an RS232 to USB converter 510 such as an
FT8U232AM chip. In a preferred embodiment, the USB streaming controller 312 is attached to
a current sensing device or other sensor 321. When the sensor 321 senses an overcurrent or the
activation of an element with excessive current requirements, for example, when the scanning
element 520 is activated, the USB streaming controller 312 can send a signal 322 to the speaker
amplifier 320 that mutes the speaker 315, thus reducing the current requirements of the handheld
device.
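
The effect of muting the speaker on the overall current budget when the scanning element is activated can be sketched as a simple sum. All of the per-element current figures in the example are hypothetical and are not taken from the specification; only the 500mA limit comes from the description.

```python
# Illustrative only: element names and current figures are assumed for the example.
USB_BUDGET_MA = 500  # maximum draw from the USB connection, per the description


def total_current_ma(loads_ma, speaker_muted):
    """Sum the current of all active loads, leaving out the speaker when it is muted."""
    return sum(
        current
        for name, current in loads_ma.items()
        if not (speaker_muted and name == "speaker")
    )


def mute_needed(loads_ma):
    """Return True when the unmuted total would exceed the USB budget."""
    return total_current_ma(loads_ma, speaker_muted=False) > USB_BUDGET_MA


# Hypothetical example: the scan engine has just been activated.
example_loads = {"controller": 100, "speaker": 150, "scanner": 300}
```

With these assumed figures, the unmuted total of 550mA exceeds the budget, whereas muting the speaker brings the total down to 400mA.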

FIG. 6 is a block diagram of one embodiment of the dictation device of the invention with
fingerprint security and memory stick features. The basic design of the microphone unit 603 is
similar to that shown in FIG. 3, except that there is an additional USB hub controller 610. The
additional USB hub controller may be part of the microphone unit itself, or it may be part of a
separate base unit 615 that is attached to the computer 301. The remainder of the description of
this embodiment presumes that there is a separate base unit 615 attached to the computer through
the USB root hub 302. In this embodiment, additional supply voltage regulators 650 and an
additional power switch 620 provide power to the base unit 615.

The base unit may further have an additional USB hub controller 610. One port of the
additional USB hub controller 610 may be connected to the USB hub controller for the
microphone and pointing device 304, as described above in the text accompanying FIG. 3. The
other two ports of the additional USB hub controller 610 may be used for other features. In this
embodiment, one of the other two ports of the additional USB hub controller 610 may
accommodate additional memory, for example, through a memory stick such as SmartMedia™
or any other memory device, so that dictation may be digitally stored within the microphone 603,
base unit 615, or both. The remaining port of the additional USB hub controller 610 may
accommodate a fingerprint identification device 640. A fingerprint identification device 640
would allow only authorized users to use a microphone of the invention to dictate and sign
medical reports.

CLAIMS

What is claimed is: 

1. A handheld device for receiving audio input in the form of speech, the audio input 
recorded in a recording medium and processed by a speech recognition engine, thereby 
generating text, the handheld device comprising: 

an omni-directional microphone element disposed at a distal end, 
a wind screen selected according to a predetermined wind noise sensitivity factor, 
an electric circuit for receiving the audio input in the form of speech and for converting 
analog signals to digital signals, the electric circuit having a gain control for providing a signal 
level that is suitable for speech recognition, and 

a set of input buttons economically positioned on a top surface of the handheld device, 
wherein at least one of the input buttons is for selectively navigating through predetermined 
sections of a form, and the set of input buttons including a button for selecting a predetermined 
section of a form and a button for initiating recording, 

wherein the at least one of the input buttons for selectively navigating predetermined 
sections of the text, the button for selecting a predetermined section of a form, and the button for 
initiating recording are positioned to allow one-handed navigation through the predetermined 
sections of a form, selection of a predetermined section of a form, and initiation of recording. 

2. The handheld device of claim 1, wherein the set of input buttons includes a button for 
advancing to a next predetermined section of a form and a button for going back to a previous 
predetermined section of a form. 

3. The handheld device of claim 1, wherein the set of input buttons includes a button for
selecting a portion of text. 

4. The handheld device of claim 1, wherein the set of input buttons includes a button for 
reviewing back through an audio file to a previously recorded portion of the file, a button for 
advancing through an audio file to a previously recorded portion of the file, and a button for 
playing a portion of an audio file through an integrated speaker on the handheld device. 

5. The handheld device of claim 4, wherein the button for playing a portion of an audio file 
is also a stop button, wherein when any function of the handheld device is active, pressing the 
stop button deactivates the handheld device. 

6. The handheld device of claim 1, wherein the set of input buttons includes a button for 
dictation, wherein when the button for dictation is pressed, the microphone element is activated 
to receive audio input in the form of speech, the audio input is digitized and transmitted to the 
speech recognition engine, and the speech recognition engine translates the speech into text. 

7. The handheld device of claim 1, further comprising a pointing device on the top surface 
of the handheld device and at least one button associated with the pointing device. 

8. The handheld device of claim 7, wherein the pointing device is a microjoystick. 

9. The handheld device of claim 7, wherein the at least one button associated with the 
pointing device is on the bottom surface of the handheld device. 

10. The handheld device of claim 1, wherein the set of input buttons includes a button for
signing a form, wherein when the button for signing a form is pressed, the form is modified to 
include an indicator that the form has been signed. 

11. The handheld device of claim 1, wherein the set of input buttons includes a button for 
coding. 

12. A handheld device for receiving audio input in the form of speech, the audio input 
recorded in a recording medium and processed by a speech recognition engine, thereby 
generating text, the handheld device comprising: 

a USB hub for receiving and transmitting signals through a USB interface to a USB root 
hub of a computer, 

a power switch for switching power drawn from the USB interface between a first USB
port on the USB hub and a second USB port on the USB hub,

a pointing device connected to the second USB port on the USB hub, and

a USB streaming controller connected to the first USB port on the USB hub, wherein the
USB streaming controller receives digital audio signals from an audio codec that converts analog
audio signals from a microphone element into the digital audio signals.

13. The handheld device of claim 12, further comprising a preamplifier between the audio 
codec and the microphone for amplifying the analog audio signals from the microphone before it 
is digitized by the audio codec. 

14. The handheld device of claim 13, further comprising a speaker, wherein digital audio 
signals from the USB streaming controller are converted into analog audio signals by the audio 
codec, the analog audio signals are amplified by an amplifier, and the amplified analog audio
signals drive the speaker. 

15. The handheld device of claim 13, further comprising a plurality of buttons connected to a 
plurality of switches, wherein when a button is pressed a corresponding switch is closed, and the 
USB streaming controller detects the closed switch. 

16. The handheld device of claim 15, further comprising at least one LED connected to the 
USB streaming controller, wherein when one of the plurality of the buttons is pressed and the 
corresponding switch is closed, a recording function of the handheld device is activated and the 
at least one LED is lit. 

17. The handheld device of claim 12, further comprising a data storage device. 

18. The handheld device of claim 12, further comprising a fingerprint security device. 

19. The handheld device of claim 17, further comprising a fingerprint security device. 

20. A handheld device for receiving audio input in the form of speech, the audio input 
recorded in a recording medium and processed by a speech recognition engine, thereby 
generating text, the handheld device comprising: 

a USB hub for receiving and transmitting signals through a USB interface to a USB root 
hub of a computer, 

a power switch for switching power drawn from the USB interface between a first USB
port on the USB hub and a second USB port on the USB hub,

a pointing device connected to the second USB port on the USB hub, 

a USB streaming controller connected to the first USB port on the USB hub, wherein the
USB streaming controller receives digital audio signals from an audio codec that converts analog 
audio signals from a microphone element into the digital audio signals, 

a barcode scanner for scanning barcodes to identify the subject matter of the audio input. 

21. The handheld device of claim 20, further comprising a preamplifier between the audio 
codec and the microphone for amplifying the analog audio signals from the microphone before it 
is digitized by the audio codec. 

22. The handheld device of claim 21, further comprising a speaker, wherein digital audio 
signals from the USB streaming controller are converted into analog audio signals by the audio 
codec, the analog audio signals are amplified by an amplifier, and the amplified analog audio 
signals drive the speaker. 

23. The handheld device of claim 21, further comprising a plurality of buttons connected to a
plurality of switches, wherein when a button is pressed a corresponding switch is closed, and the 
USB streaming controller detects the closed switch. 

24. The handheld device of claim 23, further comprising at least one LED connected to the 
USB streaming controller, wherein when one of the plurality of the buttons is pressed and the 
corresponding switch is closed, a recording function of the handheld device is activated and the 
at least one LED is lit. 

25. The handheld device of claim 20, further comprising a data storage device. 

26. The handheld device of claim 20, further comprising a fingerprint security device. 

27. The handheld device of claim 25, further comprising a fingerprint security device.






WO 2004/098158 



1/6 



PCT/US2004/012978 




FIG. 6 (sheet 6/6): DICTAPHONE SUPERMIC-FF BLOCK DIAGRAM. Labeled elements: PC 301 with USB root hub 302; base unit 615 with power switch 620 (TPS2044), USB hub 610 (TI TUSB2036), voltage regulators 650 (3.3VD and 3.3VA), memory stick, and fingerprint device 640; microphone unit 603 with power switch 306 (TPS2044), USB hub 304 (TI TUSB2036), voltage regulators (3.3VD and 3.3VA), USB streaming controller 312 (TI TAS1020A), EEPROM 311, audio CODEC 318 (Wolfson WM9707), red and green LEDs 316 and 317, sense element 321, pointing device 310, and switches 313.






